Thursday, July 07, 2016

Understanding QMC algorithms

I started working on QMCPACK earlier this year.  As usual when diving into an unfamiliar code base, one of the first challenges is to understand how the code works.  Scientific codes can take more work to comprehend because they implement complex algorithms, and one must understand the underlying algorithms in addition to the code structure.  Reverse-engineering the algorithm from the code is usually challenging, so hopefully there's an associated paper that describes it.  A publication, though, gives a high-level view of the algorithm and the mathematics, and there's usually a gap between the paper and the details of the implementation.

Once the code and algorithm are understood, the next question is how to best test the code.   Ideally the original implementer should leave behind code and documentation for this purpose, but usually what we have is the final code and a paper.    What other kinds of artifacts (code, documentation, data, test cases,  etc.) should an implementer produce to help with this?

In an attempt to address these issues, I created a repository with various Python scripts and Jupyter (IPython) notebooks covering different algorithms used in a Quantum Monte Carlo code.   The repository is here: https://github.com/markdewing/qmc_algorithms

Some further benefits and goals of these scripts include:

Ease of comprehension

Notebooks show the derivations and demonstrate the algorithm working in a simpler context.  These aim to be a bridge between the paper or text description of an algorithm and the final production implementation.

Symbolic Representation

Represent the various equations in a more symbolic form.  This allows mathematics-aware transformations, such as taking derivatives.  See the Jastrow factor in Wavefunctions/Pade_Jastrow.ipynb for an example, as well as the wavefunctions and energy expressions in Variational/Variational_Hydrogen.ipynb and Variational/Variational_Helium.ipynb.
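
As a flavor of what that notebook covers, here is a minimal SymPy sketch in the same spirit (not copied from the repository): a Pade-form Jastrow term u(r) = A*r/(1 + B*r), with the derivatives needed for the local energy computed symbolically.

    import sympy

    r, A, B = sympy.symbols('r A B', positive=True)

    # Pade form of a two-body Jastrow correlation term: u(r) = A*r/(1 + B*r)
    u = A*r/(1 + B*r)

    # Mathematics-aware transformations: derivatives needed for the local energy
    du = sympy.simplify(sympy.diff(u, r))      # A/(B*r + 1)**2
    d2u = sympy.simplify(sympy.diff(u, r, 2))  # -2*A*B/(B*r + 1)**3
    print(du, d2u)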

Testing

These scripts can be used for testing in several ways.  One is to evaluate the expressions at a single point to check values in unit tests.
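
For instance, continuing the Pade form from the sketch above (the parameter values here are arbitrary):

    import sympy

    r, A, B = sympy.symbols('r A B', positive=True)
    u = A*r/(1 + B*r)

    # Evaluate at one point; the printed number becomes the expected value in a unit test
    print(u.subs({A: 1.0, B: 0.5, r: 2.0}).evalf())   # 1.00000000000000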

Eventually the symbolic or script representation should be compared against the code for a wider range of values and parameters.  This is similar to property-based testing in software engineering, but there are a couple of issues to address.  The first is infrastructure support.  Most property-based testing frameworks focus on integer parameters, like array lengths, and not so much on floating point values.  A second issue is cross-language coding.  In our case, the script representation is in Python and the code to test is in C++. 
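
A rough sketch of the direction, using the Hypothesis library (which does offer float strategies).  The production-side function below is a stand-in: in reality it would be a Python binding to the C++ code under test, and such a binding is an assumption here.

    import math
    import sympy
    from hypothesis import given, strategies as st

    # Symbolic 'specification' of the Pade Jastrow term
    r_s, A_s, B_s = sympy.symbols('r A B')
    u_symbolic = sympy.lambdify((A_s, B_s, r_s), A_s*r_s/(1 + B_s*r_s))

    # Stand-in for the production implementation; in practice this would be a
    # binding to the C++ code under test
    def u_production(A, B, r):
        return A*r/(1.0 + B*r)

    @given(A=st.floats(0.1, 10.0), B=st.floats(0.1, 10.0), r=st.floats(0.01, 100.0))
    def test_jastrow_matches_symbolic_form(A, B, r):
        assert math.isclose(u_production(A, B, r), u_symbolic(A, B, r), rel_tol=1e-12)

    test_jastrow_matches_symbolic_form()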

Also for testing, simpler or slower algorithms can be used to verify more complex algorithms.  For example, the repository contains a Python script for Ewald summation (LongRange/ewald_sum.py) that uses the standard break-up between short- and long-range pieces.  QMCPACK optimizes this break-up, but the result should still be the same, up to some error tolerance.
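
The pattern looks roughly like the sketch below.  To keep it self-contained I've used a toy quantity (a Lennard-Jones pair energy, not an Ewald sum), with a plain double loop as the slow reference and a vectorized version standing in for the optimized code:

    import numpy as np

    def pair_energy_loop(positions):
        """Straightforward O(N^2) loop: the slow reference implementation."""
        energy = 0.0
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                r = np.linalg.norm(positions[i] - positions[j])
                energy += 4.0*(r**-12 - r**-6)   # Lennard-Jones pair potential
        return energy

    def pair_energy_vectorized(positions):
        """Vectorized version, standing in for the 'optimized' implementation."""
        diff = positions[:, None, :] - positions[None, :, :]
        r = np.sqrt((diff**2).sum(axis=-1))
        iu = np.triu_indices(len(positions), k=1)
        return np.sum(4.0*(r[iu]**-12 - r[iu]**-6))

    rng = np.random.default_rng(7)
    pos = rng.uniform(0.0, 5.0, size=(10, 3))
    assert np.isclose(pair_energy_loop(pos), pair_energy_vectorized(pos), rtol=1e-8)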

Code generation

Ultimately, code generation can be used to connect the symbolic form to the executable code. This will require a maintainable mechanism for doing the code generation (it's easy to start doing code generation, but there can be problems with scaling it and keeping it neat, clean, and understandable). The previous step - using the symbolic forms for testing - seems like an easier intermediate path, and is better suited for testing existing code.
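
SymPy already provides some building blocks for this; a minimal sketch (the function name and file prefix are arbitrary):

    import sympy
    from sympy.utilities.codegen import codegen

    r, A, B = sympy.symbols('r A B')
    u = A*r/(1 + B*r)

    # Generate a C function (and header) that evaluates the symbolic expression
    [(c_name, c_code), (h_name, h_code)] = codegen(
        ('pade_jastrow', u), language='C', prefix='pade_jastrow')
    print(c_code)

The hard part is not generating a single routine like this, but keeping a whole collection of generated code maintainable.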

Summary

This is an experiment in creating artifacts that enhance comprehension and testing of scientific programs.  Take a look at the repository ( https://github.com/markdewing/qmc_algorithms ) and see if you find it useful.


Monday, January 18, 2016

Reduced-order models in QMC

After the last post about multiscale modeling, let's explore how some of the methods and ideas might apply to improving QMC.

Possible sources of ideas:
- Reduced-order models.  Usually applied to time-dependent problems.
- Dimensionality reduction, from statistics and machine learning.
- Low-rank matrix approximation.
- Compressed sensing - determine a sparse sampling before performing the sampling.  See this article for an application to wavefunctions.

Some of these approaches arise in different domains but use similar underlying mathematics (e.g., applying the Singular Value Decomposition (SVD) to an appropriate matrix).
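
As a generic illustration of the SVD-based idea (not tied to any particular QMC matrix), here is a numpy sketch of a low-rank approximation:

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.standard_normal((100, 100))

    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    k = 10  # keep only the k largest singular values
    M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    # Eckart-Young: M_k is the best rank-k approximation in the 2-norm,
    # and the error equals the (k+1)-th singular value
    print(np.linalg.norm(M - M_k, 2), s[k])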

The "order" in "reduced-order models" refers to the complexity of the model.  Some possibilities for QMC on how the complexity might be varied include: number of electrons, number of sample points in the integral, quality of the wavefunction, and the precision of the arithmetic.

Let's look at each of these in a little more detail:

1. Number of electrons
   A. Integrate out some electrons - yields pseudopotential
   B. Integrate out all electrons - yields intermolecular potentials between atoms/molecules

 To ultimately reduce the complexity of the model, most or all of the electrons need to be eliminated.  This is also the natural decomposition at first look - generate a potential from an electronic structure calculation.  However, this may be too large a jump in scales in some cases, and it may be useful to find intermediate models that still include some or all of the electrons.

2. Number of sample points in the integral
    A. Subset of points from an existing set of samples.  Not so useful for that particular run, but could be useful for a nearby system. Possible ways a system could be 'nearby':
         - Changing parameters for VMC optimization
         - Core electrons for pseudopotentials
         - Weak intermolecular interactions
   B. Generate new points faster or more accurately
        Quasi Monte Carlo methods (the other QMC) or other low-discrepancy sequences (see the sketch at the end of this list).

Side note on pseudopotentials.  They serve two purposes:
   a. Reduce the number of particles that need to be handled
   b. Reduce the variance introduced by sampling the core electrons

  I suspect that (b) is more important than (a), at least for small problems.  Pseudopotentials complicate various QMC algorithms, so improved sampling that reduces the variance could be an effective alternative.

3. Quality of wavefunction
  In general we would like to have a systematic way of controlling the quality (computational complexity vs. error) of the wavefunction.   One use for less-accurate approximations to the wavefunction might be as a guide for faster generation of new sample points.

4. Precision of arithmetic
  The full model is really the position of all the electrons in space.   We might consider restricting the electrons to lattice points or reducing the precision in places.   For examples in machine learning, see here and here.
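
Returning to point 2B above, here is a small self-contained sketch comparing plain pseudorandom sampling with a Halton low-discrepancy sequence on a toy two-dimensional integral.  There is nothing QMC-specific in it; it just shows the kind of point generation being referred to:

    import numpy as np

    def van_der_corput(n, base):
        """First n points of the van der Corput sequence in the given base."""
        points = np.empty(n)
        for i in range(n):
            x, denom, k = 0.0, 1.0, i + 1
            while k > 0:
                denom *= base
                k, remainder = divmod(k, base)
                x += remainder/denom
            points[i] = x
        return points

    def halton_2d(n):
        return np.column_stack([van_der_corput(n, 2), van_der_corput(n, 3)])

    def f(p):
        # Toy integrand on the unit square; the exact integral is 1.0
        return np.pi**2/4.0 * np.sin(np.pi*p[:, 0]) * np.sin(np.pi*p[:, 1])

    n = 4096
    rng = np.random.default_rng(1)
    print('MC:', f(rng.uniform(size=(n, 2))).mean(), ' QMC:', f(halton_2d(n)).mean())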


Thursday, December 31, 2015

Towards Understanding Multiscale Modeling

I've been interested in multiscale modeling lately, and this post is an attempt to collect some resources and clarify some of my understanding of the topic.

Some places to start are the wikipedia entry and the scholarpedia entry.  Go more in-depth with the book "Principles of Multiscale Modeling".  Also see this review (pdf) (from 2000, but still a good overview of an array of fields and problems).

There are some other 'multi's that may be encountered in modeling:
  • Multiphysics - combining multiple physical models in a single simulation.  For instance, a thermal model and a mechanical model where the mechanical properties are temperature-dependent.  Usually the models occur on the same length scale (vs. multiscale)
  • Multigrid - method of accelerating a solution and bridging the largest and smallest lengths in a single model.  Typically using grids of different resolution.
  • Multiscale - combining models at different length (or time, energy) scales

What defines a 'scale'?

Each scale consists of a self-contained theory (or model),  approximations for solution (often these approximations come to define variants of each model), and validation by experiments (carefully designed to probe at this scale, and isolate the influence of other scales).

Transitioning between scales often involves a qualitative change in the type of theory (particle-based, continuum modeled with PDEs, networks of coupled ODEs, etc.).  The "self-contained" part usually implies a small number of input parameters.  Some of this is practical - if the scale had too many inputs it would be hard to understand, hard to reason about, and hard to compute.
The simplest type of multiscale modeling is sequential (also called 'hierarchical'), where a lower level model is used to derive parameters for a higher level model.  For instance, using electronic structure methods to find parameters for empirical potentials in an atomistic simulation.
The other type is concurrent modeling (in some cases called 'on the fly'), where simulations at different scales are done at the same time.  In the example above, the atomistic simulation would call down to the electronic structure simulation as needed to get the necessary parameters.  Alternatively, if it is not possible to parameterize the potential, the atomistic simulation would get forces and energies directly from the electronic structure simulation.
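
A toy sketch of the two coupling styles (the 'lower-level model' here is just a cheap stand-in; nothing corresponds to real electronic structure code):

    import numpy as np

    def lower_level_energy(r):
        """Stand-in for an expensive lower-level calculation (e.g. electronic structure)."""
        return (1.0 - np.exp(-1.5*(r - 1.0)))**2   # Morse-like toy potential

    # Sequential (hierarchical): fit an empirical potential once, then use only the fit
    r_train = np.linspace(0.8, 2.0, 20)
    fit = np.polyfit(r_train, lower_level_energy(r_train), deg=6)

    def empirical_energy(r):
        return np.polyval(fit, r)

    # Concurrent ('on the fly'): the higher-level simulation calls down whenever it
    # needs a value, instead of relying on a pre-fit parameterization
    def concurrent_energy(r):
        return lower_level_energy(r)

    print(empirical_energy(1.3), concurrent_energy(1.3))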

Multiscale in Materials

This whole process seems most developed for engineering certain metals and alloys, where it's called Integrated Computational Materials Engineering.

The book "Integrated Computational Materials Engineering for Metals: Using Multiscale Modeling to Invigorate Engineering Design with Science" has some case studies, including one on the design of a Cadillac control arm, covering the scales from electronic structure (DFT) to the final mechanical design of arm.  I found it helpful to have a specific example running the entire range.

Scales encountered for materials problems include
  • electronic structure
  • atomic motion
  • dislocations
  • voids and other microstructures
  • continuum description
  • engineering design (using FEA - Finite Element Analysis)
The Minerals, Metals and Materials Society (TMS) produced a report that covers the current state and challenges for the field: "Modeling Across Scales: A Roadmapping Study for connecting materials models across length and time scales"


Multiscale in Biology

There are a lot of directions to go when building upward from electronic structure. Another possibility is biology.  This article is a good overview:
"Multiscale computational models of complex biological systems".
 Scales encountered include

  • genetic
  • protein
  • pathway and signaling
  • whole cell
  • cell network and tissue
  • whole organ
  • multisystem and organism
(And in some cases, this could go higher to populations and ecosystems)


Multiscale in Computing

It may be instructive to compare with the various scales in computing:
  • materials (doped silicon, etc)
  • transistors
  • gates
  • larger logic assemblies (shift registers, adders, etc)
  • processors (CPU)
  • assembly language
  • low level programming language
  • high level programming language
  • operating system
  • applications
  • large-scale distributed systems

Now perhaps not every hierarchical description of abstractions should be considered 'multiscale', but it may be worth considering the similarities and differences.

One of those differences is where engineering (control) can occur.  For computing, the entire stack is engineered.  For materials, the points of control are composition, manufacturing process, and component design.  For drug design in biology, the only point of control/design is the drug molecule.  All the effects above that level are no longer under direct control - understanding how these effects propagate through the scales is a very difficult problem.

General Approaches

One of the limiting factors is how much computer time it takes to solve the model at each scale. Approaches for speeding up existing models (via multigrid or reducing the degrees of freedom) start to merge with methods for generating models for higher scales.  One approach is systematic upscaling, based on multigrid and renormalization ideas.  See Principles of Systematic Upscaling (pdf).
 
Another approach is called Heterogeneous Multiscale Modeling (HMM).  See this overview page and a review article (pdf).


Connections to QMC

Given that this blog is focused on QMC, how might these multiscale ideas relate to QMC?  Solving the electronic structure problem forms the lowest level of the hierarchy (for practical engineering problems, anyway).  Can we speed up QMC calculations, possibly by creating reduced-order models?  Also, can existing information (previous runs on the same system, or information from higher levels) be used to speed up the calculation?

Friday, August 21, 2015

Using the density in QMC


A while ago, I made a small post wondering about Levy constrained search in QMC?
That post was idle speculation, but since then some papers have been published that seriously explore the idea of combining DFT and QMC.

The series of papers:
[1] Electronic Energy Functionals: Levy-Lieb principle within the Ground State Path Integral Quantum Monte Carlo (L. Delle Site, L. M. Ghiringhelli, D. Ceperley, May 2012)

[2] Levy-Lieb Principle meets Quantum Monte Carlo (L. Delle Site, Nov 2013)

[3] Levy-Lieb principle: The bridge between the electron density of Density Functional Theory and the wavefunction of Quantum Monte Carlo  (L. Delle Site, Nov 2014)

And two related earlier papers:
Levy-Lieb constrained-search formulation as a minimization of the correlation functional (L. Delle Site, Apr 2007)
The design of Kinetic Functionals for Many-Body Electron Systems : Combining analytical theory with Monte Carlo sampling of electronic configurations (L. M. Ghiringhelli, L. Delle Site, Nov 2007)

(Dates are submission dates to arXiv, not publication dates)

Key issues when combining DFT and QMC:
  1. How to compute density functionals from a QMC calculation?
    There are two ways to go about this:
    A. Invert the samples to find a functional.
      To understand this we need to know a little bit about Orbital-Free Density Functional Theory.  While the foundation of Density Functional Theory is that the ground state energy is a functional of the density, in practice it is a complicated functional (particularly the kinetic energy piece).  The standard approach is to reintroduce orbitals, which allow the kinetic energy to be computed accurately, but at a cost in performance.  Finding a sufficiently accurate functional without needing orbitals could greatly speed up DFT calculations.  Machine learning techniques are starting to be used to compute complicated functionals (see Finding Density Functionals with Machine Learning).  A sketch of the simplest explicit kinetic-energy functional appears after this list.
      This approach could yield useful, transferable functionals.
    B. Use the QMC sampling to perform the integral over the functional (see [2]).  There is no need for an explicit expression of the functional.
  2. How to use a given density in a QMC calculation?
    In [1], the authors propose the use of Ground State Path Integrals (GSPI), with some additional thoughts on how to sample at fixed density in [2].
    The reason for using GSPI is that the trial wavefunction affects efficiency, but not necessarily the final result, and the addition of a density constraint may speed up sampling.   Compare with a VMC approach, where a minimization of the parameters at the fixed density would still be needed (and of course the accuracy would still be limited by the flexibility of the trial wavefunction).
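
As a concrete (and deliberately crude) illustration of what an explicit density functional looks like, relating to point 1A above, here is the Thomas-Fermi kinetic energy functional evaluated on a radial grid.  This is a standard textbook approximation, not the approach of the papers above; the hydrogen density is only a sanity check.

    import numpy as np

    def thomas_fermi_kinetic_energy(density, weights):
        """T_TF[n] = C_F * integral of n(r)^(5/3): the simplest explicit kinetic functional."""
        c_f = 0.3*(3.0*np.pi**2)**(2.0/3.0)
        return c_f*np.sum(weights*density**(5.0/3.0))

    # Check with the exact hydrogen ground state density n(r) = exp(-2r)/pi
    r = np.linspace(1e-6, 20.0, 200000)
    dr = r[1] - r[0]
    density = np.exp(-2.0*r)/np.pi
    weights = 4.0*np.pi*r**2*dr        # radial volume element for the quadrature
    print(thomas_fermi_kinetic_energy(density, weights))   # roughly 0.289 (exact KE is 0.5)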

The final proposal is to make the process iterative - use DFT to compute the density, use GSPI to compute new integrals over the functionals (as in 1B), use those to recompute the DFT density, and so on.

It seems like an intriguing proposal, and I'm curious to see how well it would work on a realistic problem ([2] proposes a helium dimer as a good test case).

Wednesday, October 05, 2011

Video of SciPy 2011 talk

The video from my SciPy 2011 talk is available here.

At the end of the talk, Andy Terrell asked a question about why this structured approach is necessary - why not just program the steps directly in a script?

I agree that the system I presented does seem very 'heavyweight'. But it feels like doing the transformations directly loses some meta-information that would be useful to keep around, such as which transformations were applied at each step. Additionally, storing the pieces of the derivation in data structures makes them potentially easier to manipulate (and emphasizes that they can and should be manipulated programmatically). However, Python's meta-tools are quite powerful, and could probably be used to manipulate derivations written directly in a script.
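
A minimal sketch of the kind of structure I have in mind (not the actual code from the talk; the class and field names are invented for this post):

    import sympy

    class DerivationStep:
        """One step of a derivation: the transformation applied and the resulting expression."""
        def __init__(self, description, transform, expression):
            self.description = description
            self.transform = transform
            self.expression = expression

    x, a = sympy.symbols('x a')
    steps = [DerivationStep('starting expression', None, sympy.exp(-a*x**2))]

    # Record each transformation along with its result, rather than keeping only the result
    result = sympy.diff(steps[-1].expression, x)
    steps.append(DerivationStep('differentiate with respect to x', 'diff(x)', result))

    for s in steps:
        print(s.description, ':', s.expression)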


Switching to a different issue, the presented example effectively inlines all the functions into the integral. This is clearly not scalable in the face of larger and more complex expressions. The system needs some way to represent pieces that should transform into a function call in the generated code. I have something working now, but it needs some clean-up work.

Tuesday, June 07, 2011

Modeling derivations: Questions and Answers

The past two posts (part I and part II) described a system (or style) for scientific programming that I've been working out. This post will be more in a Q&A format (the questions are mostly ones I ask myself - perhaps it should be called a 'rhetoric & rant' format). An abstract for a talk and paper was accepted to SciPy 2011, so hopefully I can explain it clearly enough by then to get some feedback.

  1. What is the target application? Goals?

    The goals are to better enable algorithm research and development (that is, writing a scientific code), and to provide flexibility in targeting different runtime systems.

    Looking at the examples and case studies on various computer algebra system (CAS) websites, I see many cases where the user modeled some system, solved it, and then used the solution to improve or fix the system. The hard part seems to be modeling the system - this is not my target. I'm interested in problems where the second step is the hard part - and one needs to research new and better solvers.

    I'm thinking in terms of a traditional scientific code, and how to automate parts of the development workflow. One big issue is how to use a CAS to assist in the development. Also, I'm interested in problems that require high performance computing (HPC) - basically problems that will consume as much computing power and algorithm research as possible. This likely means the code will need to be tuned and altered to support different target systems (multicore workstations, distributed memory parallel clusters, accelerators like GPUs, etc.).


  2. Why not use a computer algebra system directly? It looks like you're creating a lot of extra work for yourself.

    I would like a separation between the specification of an algorithm (for scientific computation, this is largely expressed in equations) and its implementation in the runtime system (along with an automated transformation from the spec to the final system).

    Typically a CAS is associated with a full-featured programming language, but implementing the algorithm in that language doesn't preserve this separation.

    As a concrete example, a CAS typically includes facilities for performing numerical integration. What if I want to develop a new numerical integration method - are there any examples on how a CAS can be used to do that?

    Another issue with implementing HPC-style problems in a CAS is what happens when the existing solvers in the CAS are too slow, or the CAS system language is too slow (e.g., how to scale to large parallel systems). The CAS may do a great deal of work for you, but what happens when you want to move past it? This is where code generation comes in.

  3. What is the role of code generation in this system?

    More generally, the 'code generation' step is the 'transformation to the target runtime system' step. The transformation may be trivial, if the final equations can be solved by the CAS facilities, but I'm anticipating the need to ultimately target C/C++ or Fortran.

    Also, it seems useful to introduce this method in small pieces into an existing code (converting an entire code would usually be too dramatic a change to be successful). It would smooth the transition if the generated code fits the style of the existing code. A template system for generation would be a useful approach. A text-oriented system (or a basic token-oriented one, like the C preprocessor) would work, but these have problems. A system oriented around an actual parse tree of the target language would allow the most powerful transformations.
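
    A rough sketch of the text-template end of that spectrum, using SymPy's C printer to fill a template (the function name is arbitrary):

        import sympy

        x, a = sympy.symbols('x a')
        expr = sympy.exp(-a*x**2)

        # Fill a C function template with the CAS-printed expression
        template = "double kernel(double x, double a) { return EXPR; }"
        print(template.replace("EXPR", sympy.ccode(expr)))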


  4. Is this related to reproducible research in any way?

    Packages for reproducible research seem to be focused on the level of capturing parameters that are input to a scientific code, so that the results can be repeated and revisited later.
    Executing a program may be reproducible, but the construction of that program is not.

    Aside: Reproducibility can be achieved through documentation or automation. The previous standard for scientific work depended primarily on documentation. With computers, automation becomes possible as a working method. (For space reasons, this analysis is very simplistic - there is more to it.)

    This work is intended to make the construction of a code automated. One advantage is in making corrections to mistakes in the derivation - fix it, push a button, and the final code is automatically updated.

    This should be complementary to other systems for reproducible research.

  5. Why are you doing this?
    I'm tired of tracking down missing minus signs and misplaced factors of 2 in my code. A significant feature of this approach is that the entire chain of derivation and generation of the program is performed by machine, to eliminate transcription errors. (If this sounds too rosy, rest assured there are plenty of other types of errors to make).

Saturday, April 23, 2011

Example of modeling derivations

The last post discussed a style of scientific programming that modeled derivations symbolically to create a machine-checked chain from starting equation to final code.


This post will provide an example using the configuration integral for the canonical ensemble in statistical mechanics. I want to evaluate the integral using various methods, starting with grid-based integration methods. A grid-based method suffers from the curse of dimensionality, where the time to compute grows exponentially with the number of particles (which is why Monte Carlo methods are normally used). However, it can still be useful to check Monte Carlo codes against other integration methods for small particle numbers. Also, the free energy (and all associated quantities) is computed directly, whereas it is more difficult to compute with MC methods. Finally, I'm curious how large a system could be computed on today's hardware.


SymPy is the computer algebra system I'm using. The code for this example is available on github in the derivation_modeling branch, in the prototype subdirectory.


The output (MathML and generated python) can be seen here.


The flow is as follows:


  1. Derive the composite trapezoidal rule, starting from the expression for a single interval

  2. Generate code for the composite trapezoidal rule

  3. Start from the configuration integral (which I've called the partition function in the documents) and transform it by specializing various parameters - 2 particles in 2 spatial dimensions, using the Lennard-Jones interaction potential, etc - until it becomes completely specified numerically.

  4. Generate code that evaluates this integral using the generated integration code from step 2
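
To give a flavor of steps 1, 2, and 4 without opening the repository, here is a compressed sketch.  The repository records the derivation in many small steps; this jumps straight to the composite rule and to the shape of the code that gets generated:

    import sympy
    import numpy as np

    # Step 1: the composite trapezoidal rule on n intervals of width h = (b - a)/n,
    # written symbolically
    a, b, h, i, n = sympy.symbols('a b h i n')
    f = sympy.Function('f')
    composite = h/2*(f(a) + f(b)) + h*sympy.Sum(f(a + i*h), (i, 1, n - 1))
    print(composite)

    # Step 2: the kind of Python that would be generated from the rule above
    def trapezoidal(func, a, b, n):
        h = (b - a)/n
        total = 0.5*(func(a) + func(b))
        for i in range(1, n):
            total += func(a + i*h)
        return h*total

    # Step 4 (in miniature): evaluate an integral with the generated integrator
    print(trapezoidal(np.sin, 0.0, np.pi, 1000))   # ~1.9999984; the exact value is 2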



There's still a tremendous amount of work to do on this, but hopefully an outline of this style of programming is visible from the example.



  • The MathML pages have a lot of detailed small steps - it would be nice to have a collapsible hierarchy so all the little steps can be hidden.

  • Some abstractions should remain during the steps - in particular the potential. The expression for the integrand can get quite messy from explicitly inserting the expression for the potential (and this is a simple case - more complex potentials would make it unreadable).

  • More backends for the code generation - C/C++ is the obvious choice in pursuit of higher performance. (The generated python can be run through Shed Skin, for another route to C++). Converting to Theano expressions might be a quick way to try it out on the GPU.